Advanced Distribution Means for Spoken Language Corpora

Authors

  • Christoph Draxler
  • José Soler
Abstract

This report outlines the distribution of Spoken Language Corpora (SLC) on traditional CD-ROM media and a new approach via network access. High-capacity CD-ROMs are being introduced, but they offer only a marginal improvement for the distribution of SLC. Network access, however, offers many opportunities: customized SLC, on-line access, and a high degree of protection. For network access to be feasible, though, the bandwidth of existing networks will have to be increased.

Similar resources

Grammars of Spoken English: New Outcomes of Corpus-Oriented Research

Recently work on the grammar of spoken English has advanced through the use of large, general, and varied corpora of the language, including corpora of spoken discourse. Here I review the research that has been emerging from the availability of such corpora, much of it emphasizing the need for new ways of conceptualizing spoken grammar, to replace the traditional reliance on grammatical models ...

How Spoken Language Corpora Can Refine Current Speech Motor Training Methodologies

The growing availability of spoken language corpora presents new opportunities for enriching the methodologies of speech and language therapy. In this paper, we present a novel approach for constructing speech motor exercises, based on linguistic knowledge extracted from spoken language corpora. In our study with the Dutch Spoken Corpus, syllabic inventories were obtained by means of automatic ...

Treebank Profiling of Spoken and Written German

This paper profiles significant differences in syntactic distribution and differences in word class frequencies for two treebanks of spoken and written German: the TüBa-D/S, a treebank of transliterated spontaneous dialogs, and the TüBa-D/Z treebank of newspaper articles published in the German daily newspaper ’die tageszeitung’ (taz). The approach can be used more generally as a means of disti...

Concordancing for parallel spoken language corpora

Concordancing is one of the oldest corpus analysis tools, especially for written corpora. In NLP, concordancing appears in the training of speech-recognition systems. Additionally, comparative studies of different languages result in parallel corpora. Concordancing for these corpora in an NLP context is a new approach. We propose to combine these fields of interest for a multi-purpose concordance for ...

Live Lexicons and Dynamic Corpora Adapted to the Network Resources for Chinese Spoken Language Processing Applications in an Internet Era

In the future network era, a huge volume of information on all subject domains will be readily available via the network. Also, all the network information is dynamic, ever-changing and exploding. Furthermore, many of the spoken language processing applications will have to do with the content of the network information, which is dynamic. This means dynamic lexicons, language models and so on wi...

Publication date: 2004